Search results for "Text segmentation"
showing 8 items of 8 documents
Can colours be used to segment words when reading?
2015
Rayner, Fischer, and Pollatsek (1998, Vision Research) demonstrated that reading unspaced text in Indo-European languages produces a substantial reading cost in word identification (as deduced from an increased word-frequency effect on target words embedded in the unspaced vs. spaced sentences) and in eye movement guidance (as deduced from landing sites closer to the beginning of the words in unspaced sentences). However, the addition of spaces between words comes with a cost: nearby words may fall outside high-acuity central vision, thus reducing the potential benefits of parafoveal processing. In the present experiment, we introduced a salient visual cue intended to facilitate the process…
Speech- and sound-segmentation in dyslexia: evidence for a multiple-level cortical impairment
2006
Developmental dyslexia involves deficits in the visual and auditory domains, but is primarily characterized by an inability to translate the written linguistic code to the sound structure. Recent research has shown that auditory dysfunctions in dyslexia might originate from impairments in early pre-attentive processes, which affect behavioral discrimination. Previous studies have shown that whereas dyslexic individuals are deficient in discriminating sound distinctions involving consonants or simple pitch changes, discrimination of other sound aspects, such as tone duration, is intact. We hypothesized that such contrasts that can be discriminated by dyslexic individuals when heard in isolat…
Processing Continuous Speech in Infancy
2016
The present chapter focuses on fluent speech segmentation abilities in early language development. We first review studies exploring the early use of major prosodic boundary cues which allow infants to cut full utterances into smaller-sized sequences like clauses or phrases. We then summarize studies showing that word segmentation abilities emerge around 8 months, and rely on infants’ processing of various bottom-up word boundary cues and top-down known word recognition cues. Given that most of these cues are specific to the language infants are acquiring, we emphasize how the development of these abilities varies cross-linguistically, and explore their developmental origin. In particular, …
New evidence for chunk-based models in word segmentation.
2014
There is large evidence that infants are able to exploit statistical cues to discover the words of their language. However, how they proceed to do so is the object of enduring debates. The prevalent position is that words are extracted from the prior computation of statistics, in particular the transitional probabilities between syllables. As an alternative, chunk-based models posit that the sensitivity to statistics results from other processes, whereby many potential chunks are considered as candidate words, then selected as a function of their relevance. These two classes of models have proven to be difficult to dissociate. We propose here a procedure, which lea…
Phrasal prosody constrains word segmentation in French 16-month-olds
2011
Infants who are in the process of acquiring their mother tongue have to find a way of segmenting the continuous speech stream into word-sized units. We present an experiment showing that French 16-month-olds are able to exploit phonological phrase boundaries in order to constrain lexical access. Using the conditioned head-turning technique, we showed that infants trained to turn their head for a bisyllabic word responded more often to sentences that contained this word, than to sentences that contained both syllables of this word separated by a phonological phrase boundary. We compare these results with similar results obtained with English-speaking infants, and discuss their implication fo…
Semi-automatic Quasi-morphological Word Segmentation for Neural Machine Translation
2018
This paper proposes the Prefix-Root-Postfix-Encoding (PRPE) algorithm, which performs close-to-morphological segmentation of words as part of text pre-processing in machine translation. PRPE is a cross-language algorithm requiring only minor tweaking to adapt it for any particular language, a property which makes it potentially useful for morphologically rich languages with no morphological analysers available. As a key part of the proposed algorithm we introduce the ‘Root alignment’ principle to extract potential sub-words from a corpus, as well as a special technique for constructing words from potential sub-words. We conducted experiments with two different neural machine translation sys…
Lexical and sublexical units in speech perception.
2009
Saffran, Newport, and Aslin (1996a) found that human infants are sensitive to statistical regularities corresponding to lexical units when hearing an artificial spoken language. Two sorts of segmentation strategies have been proposed to account for this early word-segmentation ability: bracketing strategies, in which infants are assumed to insert boundaries into continuous speech, and clustering strategies, in which infants are assumed to group certain speech sequences together into units (Swingley, 2005). In the present study, we test the predictions of two computational models instantiating each of these strategies, i.e., Serial Recurrent Networks: Elman, 1990; and Parser: Perruchet & Vint…
A role for backward transitional probabilities in word segmentation?
2008
A number of studies have shown that people exploit transitional probabilities between successive syllables to segment a stream of artificial continuous speech into words. It is often assumed that what is actually exploited are the forward transitional probabilities (given XY, the probability that X will be followed by Y), even though the backward transitional probabilities (the probability that Y has been preceded by X) were equally informative about word structure in the languages involved in those studies. In two experiments, we showed that participants were able to learn the words from an artificial speech stream when the only available cues were the backward transitional probabilities.…
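The difference between forward and backward transitional probabilities can be made concrete with a minimal Python sketch. The function name `transitional_probabilities` and the toy syllable stream are hypothetical, chosen only for illustration; they are not the materials or the analysis code of the study. For an adjacent pair XY, the forward value is estimated here as count(XY) / count(X as first element) and the backward value as count(XY) / count(Y as second element).

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Return (forward, backward) dicts keyed by adjacent pair (x, y).

    forward[(x, y)]  estimates P(y follows | x just occurred)
    backward[(x, y)] estimates P(x preceded | y just occurred)
    """
    pairs = list(zip(syllables, syllables[1:]))
    pair_counts = Counter(pairs)
    # How often each syllable opens a pair / closes a pair.
    first_counts = Counter(x for x, _ in pairs)
    second_counts = Counter(y for _, y in pairs)

    forward = {p: c / first_counts[p[0]] for p, c in pair_counts.items()}
    backward = {p: c / second_counts[p[1]] for p, c in pair_counts.items()}
    return forward, backward


# Hypothetical toy stream (illustrative only, not from the study).
stream = ["a", "b", "a", "c", "d", "b"]
forward, backward = transitional_probabilities(stream)

# "a" is followed by "b" once and by "c" once, so the forward value is 0.5,
# but "c" is always preceded by "a", so the backward value is 1.0.
print(forward[("a", "c")], backward[("a", "c")])
```

In this toy stream the pair ("a", "c") is weakly predicted in the forward direction but perfectly predicted in the backward direction, which is the kind of dissociation needed to test whether learners can rely on backward statistics alone.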